Collectively Representing Semi-Structured Data from the Web
Abstract
In this paper, we propose a single low-dimensional representation of a large collection of table and hyponym data, and show that, with a small number of primitive operations, this representation can be used effectively for many purposes. Specifically, we consider queries such as set expansion and class prediction. We evaluate our methods on publicly available semi-structured datasets from the Web.
Similar resources
Georeferencing Semi-Structured Place-Based Web Resources Using Machine Learning
In recent years, the content shared on the web has grown significantly. A large part of this information is publicly available in the form of semi-structured data. Moreover, a significant amount of this information is related to place. Such information refers to a location on the earth but does not contain any explicit coordinates. In this research, we tried to georefer...
Mining Association Rules from Semi-Structured Data
Despite the growing popularity of semi-structured data such as Web documents, most knowledge discovery research has focused on databases containing well-structured data. In this paper, we try to find useful information in semi-structured data. In our approach, we begin by representing semi-structured data using a prototype-based approach. We then detect the most typical common structure of semist...
Semi-Structured Data Extraction from Heterogeneous Sources
This paper concerns the extraction of semi-structured data from Web pages generated from multiple on-line services. This task is addressed by representing the schemas for semi-structured data and crafting generic wrappers based on the schemas. We introduce a hybrid representation method for schemas of semi-structured data, consisting of a concept hierarchy and a set of knowledge unit frames. A ...
Enhanced Database Migration Technique Using XML
XML has become a de facto standard for representing and exchanging data over the Web. It is designed to structure and carry data in a sensible way, thus helping programmers and web developers manipulate the data easily and efficiently. With the tremendous growth of XML data on the Internet, an efficient database system has become necessary to maintain it. There are many Internet applications that pro...
FLEXIS – A FleXible Information System based on XML Data Model
Semi-structured data has gained considerable prominence in the recent past, especially after the realization that HTML is inadequate for representing information on the Web. Semi-structured data is characterized by the lack of a rigid structure (schema) and by a structure that evolves. XML has been adopted as a practical model for representing semi-structured data and also as the standard for data exchange on...
Publication date: 2012